
This is a repost from my Substack. Subscribe for my latest. 

AI is often framed as a purely objective, rational machine. But a more accurate metaphor, as Alice Xiang puts it, is that of a mirror. AI models learn from the data we give them, and in doing so, they reflect our world back at us, including all its existing injustices and biases.

The central challenge, then, is not to decide whether humans or machines are fairer, but to understand what this new reflection is showing us about ourselves, and how our legal systems are attempting (and failing) to polish the mirror.

Part I: The Anatomy of a Biased Machine

Algorithmic bias can be defined as systematic and repeatable errors within a computational system that create unfair outcomes, such as privileging one arbitrary group of users over others. It is, at its core, a human problem, as computers learn from the data and instructions we provide. The emergence of these systems does more than automate old prejudices; it makes them systematic, scalable, and auditable.

Bias is not a ghost that mysteriously haunts the machine; it is a foundational element introduced at every stage of its creation, from data collection to model design. This process takes subjective value judgments and launders them through a veneer of scientific neutrality that makes them harder to identify and challenge.

1. The Data: Garbage In, Gospel Out

A widely understood source of algorithmic bias is the data used to train the model. The principle of “garbage in, garbage out” is paramount; if an AI system is fed faulty or prejudiced historical data, its predictions will be equally faulty. 

In 2018, it came to light that Amazon had built (and ultimately scrapped) an AI-based tool to screen job candidates. Because the historical data reflected a tech industry where men held the majority of technical roles, the algorithm learned to penalize résumés that included the word “women’s”. The algorithm was not programmed to be sexist. It was programmed to find candidates who looked like past successful hires, and in doing so, it perfectly replicated the historical gender disparities in the data.
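
To make the mechanism concrete, here is a minimal sketch, using a handful of synthetic résumés rather than Amazon’s actual data or model, of how a text classifier trained on historically skewed hiring labels ends up assigning a negative weight to a gender-correlated token:

```python
# Synthetic illustration only: a tiny "historical hires" dataset in which
# past hires (label 1) rarely contain the token "women's", because past
# hiring skewed male. The labels, not the code, carry the bias.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

resumes = [
    ("captain of men's chess club, java developer", 1),
    ("java developer, men's rugby team", 1),
    ("python engineer, hackathon winner", 1),
    ("captain of women's chess club, java developer", 0),
    ("python engineer, women's coding society", 0),
    ("java developer, hackathon winner", 1),
]
texts, hired = zip(*resumes)

vec = CountVectorizer()
X = vec.fit_transform(texts)
model = LogisticRegression().fit(X, hired)

# Inspect the learned weights (CountVectorizer tokenizes "women's" to "women").
weights = dict(zip(vec.get_feature_names_out(), model.coef_[0]))
print("weight for 'women':", round(weights["women"], 2))  # negative: penalized
print("weight for 'men':  ", round(weights["men"], 2))    # non-negative
```

Nothing in the code mentions gender; the penalty emerges purely from imitating past decisions.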


Similarly, when predictive policing algorithms are trained on historical arrest data, they do not learn where crime occurs, but rather where police have historically made arrests. If a city has a history of over-policing minority neighborhoods, the algorithm will learn to associate those neighborhoods with higher crime risk. This creates a feedback loop where increased police presence leads to more arrests, which in turn validates the algorithm’s initial prediction, perpetuating a cycle of discrimination.

2. The Algorithm

The process of designing and building an algorithm is a series of human choices, each one a potential entry point for bias. These flaws transform societal prejudice into computational logic.

  1. Selection Bias: This occurs when the data used to train a model is not representative of the real-world population it will be used on. In 2012, the City of Boston released an app called “Street Bump” that used a smartphone’s sensors to automatically detect and report potholes as residents drove around. In theory, this would create a perfectly objective map of road repair needs. In practice, residents in lower-income neighborhoods were less likely to own smartphones or have data plans, leading to significant under-reporting in those areas. The resulting dataset was not a map of potholes, but a map of smartphone ownership, creating a biased view of where city services were most needed.


  2. Proxy Bias: Even when developers explicitly remove protected attributes like race or gender from a dataset, algorithms are adept at finding proxies that are highly correlated with those attributes. A US healthcare algorithm was designed to predict which patients had complex health needs and required extra medical care. To do this, it used past healthcare spending as a proxy for health need. However, Black patients historically have had less access to care and therefore have spent less, regardless of their needs. The algorithm concluded that Black patients were healthier than they actually were. This reduced the number of Black patients identified for care by more than 50%.
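
The same mechanism is easy to reproduce in a toy simulation. The numbers below are invented (this is not the actual healthcare model): both groups are given identical distributions of true health need, but one group’s access barriers suppress its spending, so a “rank by spending” rule under-selects it for extra care:

```python
# Toy proxy-bias simulation with made-up parameters.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

group = rng.choice(["A", "B"], size=n)            # A: better access, B: worse access
true_need = rng.gamma(2.0, 1.0, size=n)           # identical need distribution for both groups
access = np.where(group == "A", 1.0, 0.6)         # assumption: B spends 40% less per unit of need
spending = true_need * access * rng.lognormal(0.0, 0.2, size=n)

# The algorithm flags the top 10% of *spenders* for a care-management program.
flagged = spending >= np.quantile(spending, 0.90)

for g in ["A", "B"]:
    print(f"group {g}: {flagged[group == g].mean():.1%} flagged for extra care")
# Group B is flagged far less often despite having exactly the same need.
```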

3. The Feedback Loop

Algorithms are capable of reinforcing and amplifying existing inequalities. When a biased system makes a decision, that decision becomes a new data point that can be fed back into future versions of the model, making the initial bias appear more “correct” over time.

Consider an algorithm used for making loan decisions. Trained on historical data reflecting past discrimination, the algorithm denies loans to applicants in a particular neighborhood, using zip code as a proxy for race. This lack of credit and investment contributes to economic decline in that neighborhood. When the algorithm is retrained on newer data, it now sees even stronger statistical evidence that this neighborhood is a “risky” place to lend, causing it to deny even more loans. The algorithm’s biased prediction has become a self-fulfilling prophecy.
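
The loop can be sketched in a few lines of hypothetical code: a lender approves loans only where the estimated default rate sits below a cutoff, denial worsens a neighborhood’s real outcomes, and each retraining round then “confirms” the original judgment:

```python
# Stylized feedback loop with invented numbers, not real lending data.
neighborhoods = {"A": 0.06, "B": 0.08}   # true default rates; B starts slightly higher
estimates = dict(neighborhoods)          # the model's current risk estimates
CUTOFF = 0.07                            # approve only if estimated risk < 7%
DECLINE_PENALTY = 0.01                   # assumed knock-on effect of credit denial per round

for year in range(1, 6):
    decisions = {n: ("approve" if estimates[n] < CUTOFF else "deny") for n in neighborhoods}
    for n, decision in decisions.items():
        if decision == "deny":
            # Lack of credit worsens real outcomes, which the next retraining observes.
            neighborhoods[n] += DECLINE_PENALTY
        estimates[n] = neighborhoods[n]  # "retrain" on the newly observed rate
    print(f"year {year}: A={decisions['A']}, B={decisions['B']}, "
          f"observed default rate in B is now {neighborhoods['B']:.2f}")
# B's denial looks more and more "justified" every year, while A never worsens.
```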


Part II: Legal Struggles

The Regulatory Patchwork

In the US, in the absence of a comprehensive federal law governing algorithmic bias, a fragmented regulatory environment has begun to emerge.

Several state AI laws now explicitly (if generically) recognize algorithmic discrimination. Colorado imposes a duty of care on developers and deployers of “high-risk” AI systems (including those used in employment) to avoid algorithmic discrimination. Illinois has made algorithmic discrimination an actionable civil rights violation and prohibits employers from using zip codes as a proxy for protected classes.

At the federal level, bills like the Algorithmic Fairness Act and the AI Civil Rights Act have been proposed but have not passed, leaving agencies like the Equal Employment Opportunity Commission to lead enforcement through guidance and litigation.

Currently, the primary legal tool deployed against algorithmic discrimination is the theory of disparate impact, established under Title VII of the Civil Rights Act of 1964. If a seemingly neutral rule, such as a pre-employment test or a physical requirement, disproportionately harms a group protected by law (on the basis of race, gender, religion, etc.) and cannot be proven to be a “business necessity,” it is considered unlawful. No proof of discriminatory intent is required.

This framework is now being applied to algorithms, with plaintiffs arguing that automated screening tools function as discriminatory “tests” under Title VII, the Age Discrimination in Employment Act (ADEA), and the Americans with Disabilities Act (ADA).

But these laws fall short when confronted with the realities of the technology and its real-world applications.

The Black Box Problem Cripples Plaintiffs

A plaintiff trying to challenge an algorithm is shut out from the start. First, the model and its training data are typically guarded as valuable trade secrets, so a plaintiff cannot examine them to diagnose the source of the bias.

Second, even with access, many models are black boxes due to a lack of interpretability. They identify patterns but can’t explain the logic behind them. An algorithm might simply learn to find candidates who “look like” people hired in the past, effectively laundering historical biases into a seemingly objective process.

Crucially, under the Civil Rights Act, the person alleging discrimination bears the burden of proving that a facially neutral practice has a disproportionate negative impact on a protected class. How is a Taco Bell employee expected to find proof of wrongful termination inside a black-box algorithm he can neither access nor explain?


Even with a lawyer’s help, traditional legal discovery processes, designed for retrieving documents and deposing human witnesses, are ill-equipped to audit complex, dynamic machine learning models. The result is a crisis of evidence.

The Paradox of “Colorblind” Code

The legal principle of anticlassification holds that protected characteristics like race and gender should not be considered in decision-making, aiming for a “colorblind” or “gender-neutral” process. This view is gaining traction in American jurisprudence.

However, this is a paradoxical position. To determine whether an algorithm is biased, and to subsequently correct that bias, it is technically necessary to use those very protected attributes. One cannot know if a model is less accurate for women unless one can label which users are women and compare the model’s performance across genders.
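
A minimal audit sketch, using hypothetical labels and predictions, makes the point: per-group accuracy simply cannot be computed without the protected attribute itself:

```python
# Hypothetical model outputs; the point is the dependency, not the numbers.
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])   # actual outcomes
y_pred = np.array([1, 0, 0, 1, 0, 0, 1, 0])   # model predictions
gender = np.array(["m", "m", "f", "m", "f", "f", "f", "m"])  # the attribute a strict anticlassification rule would bar

for g in ("m", "f"):
    mask = gender == g
    print(f"accuracy for group {g}: {(y_true[mask] == y_pred[mask]).mean():.2f}")
# Without the `gender` column there is no way to produce, or act on, this comparison.
```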

This also creates a catch-22: the very act of performing a bias audit could be seen as legally suspect under a strict anticlassification regime. Such a legal climate risks incentivizing willful ignorance or obfuscation, as developers may fear that discovering and attempting to fix a bias would create a record of disparate treatment (intentional discrimination) that exposes them to lawsuits.

No Universal Standard For Fairness

In 2016, ProPublica published a groundbreaking analysis of a proprietary risk-assessment algorithm called COMPAS (Correctional Offender Management Profiling for Alternative Sanctions), which was used by courts across the country to help inform decisions about bail and sentencing. It revealed a disturbing pattern: the model was more likely to incorrectly label Black defendants as high-risk (false positives) and more likely to incorrectly label white defendants as low-risk (false negatives).

But Northpointe, the company behind COMPAS, argued that its algorithm was in fact fair. This argument stems from two fundamentally different definitions of “fairness”:

  1. Predictive Parity (Northpointe’s Defense): This metric asks: for any given risk score, is the probability of recidivism the same regardless of race? If the model says someone is high-risk, the probability they re-offend should be the same whether they are Black or white. By this measure, COMPAS was fair. Its predictions were roughly equally accurate for both groups. This definition prioritizes the system’s efficiency and consistency; it ensures that the label “high-risk” has a uniform meaning for decision-makers like judges or parole boards.
  2. Equalized Odds (ProPublica’s Critique): This metric focuses on error rates, asking: does the algorithm make mistakes at the same rate for different groups? It looks at the false positive rate (the rate at which people who will not re-offend are incorrectly labeled “high-risk”) and the false negative rate. By this measure, COMPAS failed spectacularly. Black defendants who did not go on to re-offend were nearly twice as likely to be misclassified as high-risk compared to their white counterparts (45% vs. 23%). This definition prioritizes protecting individuals from wrongful harm, arguing that a fair system should not impose the severe consequences of a false “high-risk” label disproportionately on one group.

Here’s the mathematical catch: when the underlying base rates of an outcome (recidivism) differ between two populations, it is impossible for an algorithm to satisfy both predictive parity and equalized odds simultaneously. If the training data (in this case, arrest records, which are a proxy for crime, not crime itself) shows different recidivism rates for different racial groups, you must choose which fairness metric to violate.
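
A toy calculation, with invented confusion-matrix counts chosen only so that the two groups have different base rates, shows how both sides of the COMPAS dispute can be “right” at once:

```python
# Illustrative numbers, not the actual COMPAS data.
def rates(tp, fp, tn, fn):
    ppv = tp / (tp + fp)   # P(re-offends | labeled high-risk): predictive parity compares this
    fpr = fp / (fp + tn)   # P(labeled high-risk | does not re-offend): equalized odds compares this
    fnr = fn / (fn + tp)   # P(labeled low-risk | re-offends)
    return ppv, fpr, fnr

groups = {
    "group 1 (base rate 50%)": dict(tp=300, fp=200, tn=300, fn=200),
    "group 2 (base rate 30%)": dict(tp=120, fp=80,  tn=620, fn=180),
}

for name, counts in groups.items():
    ppv, fpr, fnr = rates(**counts)
    print(f"{name}: PPV={ppv:.2f}  FPR={fpr:.2f}  FNR={fnr:.2f}")
# Both groups share PPV = 0.60 (fair by predictive parity), yet the false positive
# rate is 0.40 for group 1 versus 0.11 for group 2 (unfair by equalized odds).
```

With different base rates, keeping the score equally well calibrated for both groups mathematically forces their error rates apart, which is exactly the impossibility at the heart of the COMPAS debate.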

In criminal justice, a society founded on the principle of “innocent until proven guilty” may decide that the harm of a false positive (wrongfully labeling someone as high-risk for re-offense) is far greater than the harm of a false negative. Conversely, in public health, an algorithm designed to predict a pandemic might prioritize minimizing false negatives (failing to identify a contagious person) to prevent widespread harm.


A choice must be made. This is not a technical bug to be fixed but a fundamental trade-off between competing values, spanning ethics, politics, and society.

Part III: Suggestions

To govern the algorithmic age, the law must address three fundamental challenges: the paradox of “colorblind” code, the opaqueness of the “black box,” and the need to treat fairness as a societal mandate.

First, the law should recognize that “fairness-aware” programming is not the same as disparate treatment. This could involve creating a legal safe harbor for good-faith efforts to audit and mitigate bias, protecting developers and employers who proactively use protected-class data to ensure their systems are equitable.

Second, the law must mandate technical transparency. For high-stakes AI systems, this should include requirements for clear explanations of decisions, public disclosure of bias audits, and provisions for independent, third-party audits of fairness and accuracy. More critically, the law must shift the burden of proof. Currently, under the Civil Rights Act, a plaintiff alleging disparate impact bears the herculean task of demonstrating how a secret algorithm caused a discriminatory outcome. A more just framework would require that once a plaintiff makes a prima facie case of a discriminatory statistical impact, the burden shifts to the employer or vendor to justify the tool as a business necessity.

Third, the law should not anoint any single statistical metric as the “correct” definition of fairness. Instead, its role must be to create a procedural framework that forces this debate into the open. It must establish rules for public deliberation and democratic oversight, ensuring that these crucial value judgments are made transparently by society, not opaquely by private companies.


AI systems learn from the world as it is, not as it ought to be. Let’s fix the world by using AI instead of blindly blaming it.
